Python:
R (this notebook):
- detrend each ensemble member's time series by taking the loess residuals
- calculate the interannual standard deviation of each ensemble member's time series as sd(residuals)
- calculate the 2080-2099 (or 1995-2014) average value (not detrended) across years for each ensemble member
- for each ESM-experiment, calculate the ensemble-average value of the interannual sd and the 2080-2099 average value for plotting
Regional values for the multimodel mean (computed over more models than are looked at here) can be sanity checked against the IPCC Interactive Atlas:
https://interactive-atlas.ipcc.ch/regional-information
Unfortunately it only provides maps of the multimodel mean.
# get ecs ordering/labels
esm_labels <- read.csv(paste0(timeseries_dir,'global_tas_allesms.csv'), stringsAsFactors = FALSE) %>%
select(esm) %>% distinct %>%
mutate(plotesm = paste0(letters[as.integer(row.names(.))], '.', esm),
ECS_order = as.integer(row.names(.)))
# load time series
x <- read.csv(paste0(timeseries_dir,'global_tas_allesms_timeseries_2015_2100.csv'),
stringsAsFactors = FALSE)
process_time_series <- function(time_series_df, esm_label_info,
hist_start = 1995, hist_end = 2014){
# Get ensemble member values for projection runs:
# 1. 2080-2099 average
# 2. loess detrend each ensemble member to get IAV
#
# split by run
non_hist <- time_series_df %>% filter(experiment != 'historical')
grouped <- split(non_hist, f = list(non_hist$esm,
non_hist$experiment,
non_hist$ensemble,
non_hist$variable ) )
# split creates group of every possible combo of the variables and fills in
# empty dataframes for the ones that don't exist in data. Drop those
grouped <- grouped[sapply(grouped, nrow) > 0]
processed_groups <- lapply(grouped, FUN = function(run_df){
loess_resids <- loess(run_df$ann_agg ~ run_df$year)$residuals
run_df %>%
filter(year >= 2080, year <= 2099) %>%
group_by(esm, experiment, ensemble, variable) %>%
summarise(average_2080_2099 = mean(ann_agg)) %>%
ungroup %>%
mutate(iasd = sd(loess_resids)) ->
output_df
return(output_df)
})
individual_stats <- bind_rows(processed_groups)
rm(non_hist)
rm(grouped)
rm(processed_groups)
# calculate ensemble averages
individual_stats %>%
group_by(esm, experiment, variable) %>%
summarise(average_2080_2099 = mean(average_2080_2099),
iasd = mean(iasd)) %>%
ungroup ->
ensemble_stats
# get ensemble average historical average value:
time_series_df %>%
filter(experiment == 'historical',
year >= hist_start,
year <= hist_end) %>%
group_by(esm, experiment, ensemble, variable) %>%
summarise(historical_average = mean(ann_agg)) %>%
ungroup %>%
group_by(esm, experiment, variable) %>%
summarise(historical_average = mean(historical_average)) %>%
ungroup %>%
select(-experiment) ->
historical_ens
# shape and calculate changes for plotting:
ensemble_stats %>%
left_join(historical_ens, by = c('esm', 'variable')) %>%
left_join(esm_label_info, by = 'esm') %>%
mutate(change = average_2080_2099 - historical_average,
pct_change = 100*(average_2080_2099 - historical_average)/historical_average) ->
plot_tbl
return(plot_tbl)
}
plot_tbl <- suppressMessages(process_time_series(time_series_df = x, esm_label_info = esm_labels))
global_tas <- x
global_tas_summary <- plot_tbl
## `summarise()` has grouped output by 'esm', 'experiment', 'ensemble'. You can
## override using the `.groups` argument.
## Warning: Removed 255 rows containing missing values (`geom_line()`).
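The `summarise()` messages above are informational only; they can be silenced at the source by setting the `.groups` argument the message points to (a minimal sketch on built-in data):

```r
library(dplyr)

# .groups = 'drop' returns an ungrouped tibble directly, so the trailing
# ungroup() calls and the console messages both go away.
mtcars %>%
  group_by(cyl, gear) %>%
  summarise(mean_mpg = mean(mpg), .groups = 'drop')
```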
Pick a couple of models to compare differences between my Tgav and the stitches package's Tgav calculations:
models whose IASD plots are very different from CT's
cams <- read.csv('/Users/snyd535/Documents/GitHub/stitches/stitches/data/tas-data/CAMS-CSM1-0_tas.csv', stringsAsFactors = FALSE) %>%
select(-zstore)
x1 %>%
select(-nyear, -hist_avg) %>%
left_join(cams, by = c('esm'='model', 'variable', 'experiment', 'ensemble', 'year')) %>%
na.omit %>%
mutate(diff = ann_agg - value) ->
cams2
## [1] "====================================================="
## [1] "max difference between archive Tgav and stitches package Tgav for CAMS-CSM1-0_tas.csv:"
## [1] 2.273737e-12
## [1] "====================================================="
ukesm <- read.csv('/Users/snyd535/Documents/GitHub/stitches/stitches/data/tas-data/UKESM1-0-LL_tas.csv', stringsAsFactors = FALSE) %>%
select(-zstore)
x1 %>%
select(-nyear, -hist_avg) %>%
left_join(ukesm, by = c('esm'='model', 'variable', 'experiment', 'ensemble', 'year')) %>%
na.omit %>%
mutate(diff = ann_agg - value) ->
ukesm2
## [1] "====================================================="
## [1] "max difference between archive Tgav and stitches package Tgav for UKESM1-0-LL:"
## [1] 1.705303e-12
## [1] "====================================================="
canesm <- read.csv('/Users/snyd535/Documents/GitHub/stitches/stitches/data/tas-data/CanESM5_tas.csv', stringsAsFactors = FALSE) %>%
select(-zstore)
x1 %>%
select(-nyear, -hist_avg) %>%
left_join(canesm , by = c('esm'='model', 'variable', 'experiment', 'ensemble', 'year')) %>%
na.omit %>%
mutate(diff = ann_agg - value) ->
canesm2
## [1] "====================================================="
## [1] "max difference between archive Tgav and stitches package Tgav for CanESM5:"
## [1] 9.094947e-13
## [1] "====================================================="
miroc <- read.csv('/Users/snyd535/Documents/GitHub/stitches/stitches/data/tas-data/MIROC6_tas.csv', stringsAsFactors = FALSE) %>%
select(-zstore)
x1 %>%
select(-nyear, -hist_avg) %>%
left_join(miroc , by = c('esm'='model', 'variable', 'experiment', 'ensemble', 'year')) %>%
na.omit %>%
mutate(diff = ann_agg - value) ->
miroc2
## [1] "====================================================="
## [1] "max difference between archive Tgav and stitches package Tgav for MIROC6:"
## [1] 1.932676e-12
## [1] "====================================================="
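The four comparison chunks above differ only in the file read in; they could be collapsed into one hypothetical helper (`compare_tgav` is not in the notebook; `x1` and the tas-data directory are assumed from the chunks above):

```r
library(dplyr)

# Hypothetical helper: compare archive Tgav (x1) against one stitches
# package Tgav file and return the max absolute difference.
compare_tgav <- function(esm_file, archive_df,
                         data_dir = '/Users/snyd535/Documents/GitHub/stitches/stitches/data/tas-data'){
  pkg_tgav <- read.csv(file.path(data_dir, esm_file), stringsAsFactors = FALSE) %>%
    select(-zstore)
  archive_df %>%
    select(-nyear, -hist_avg) %>%
    left_join(pkg_tgav, by = c('esm' = 'model', 'variable', 'experiment',
                               'ensemble', 'year')) %>%
    na.omit %>%
    mutate(diff = ann_agg - value) %>%
    summarise(max_abs_diff = max(abs(diff))) %>%
    pull(max_abs_diff)
}

# e.g. sapply(c('CAMS-CSM1-0_tas.csv', 'UKESM1-0-LL_tas.csv',
#               'CanESM5_tas.csv', 'MIROC6_tas.csv'),
#             compare_tgav, archive_df = x1)
```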
Exact same data prep; area of interest: Arizona, New Mexico, Colorado, Utah
Assorted 2d and 3d scatters show that sampling across all 12 ESMs is worth doing, because the deltaT dimension alone hides things,
and that we're not leaving major holes across our identified space of deltaT, iasd of T, and deltaP at different spatial scales
global P didn't make sense to me to look at, but I could maybe do land-masked T and P instead of the global calculation?
could use Noah checking my work on masking, since I'm still getting different ensemble-average iasd values?
For the paper, I'm envisioning some of these scatters to show it's worth looking at all 12 ESMs and not fewer, and then doing the parameter estimation for the stitches functions. The curated data set is then a Zenodo archive of the generated ensemble members plus a script to pull the true runs.
Is it worth characterizing across more dimensions of uncertainty? Other variables, or maybe some kind of spatial-range metric in addition to AOI? I started thinking about some hierarchical clustering analysis on the three dimensions we've identified, but I'm not sure it's worth it? https://uc-r.github.io/hc_clustering#replication
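For the hierarchical-clustering idea, a minimal sketch with stand-in data (random values; the real features would come from the deltaT/IASD/deltaP summaries above):

```r
set.seed(42)
# Hypothetical feature table: one row per ESM, the three identified dimensions.
feats <- data.frame(deltaT = runif(12, 1, 5),
                    iasd   = runif(12, 0.05, 0.30),
                    deltaP = runif(12, -10, 10),
                    row.names = paste0('esm', 1:12))
hc <- hclust(dist(scale(feats)), method = 'ward.D2')   # standardize, then Ward
plot(hc)            # dendrogram of which ESMs group together
cutree(hc, k = 3)   # e.g. three clusters of ESMs
```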
Otherwise, in general, I'm aiming to spend next week really harassing CRV about the JOSS paper so we can get it in,
then focus on the parameter estimation for stitches for these 12
ESMs, generating a bunch of runs, and writing up the paper using some of
the above figures. Some small stitches improvements to go along with
that, mostly so that the ESMs on our list with grids labeled other than
gn are included in the package data.
Then, as we start thinking about a demonstration, are we thinking that will be a feed-forward experiment, or are we going to try to do feedbacks?
For feedbacks, we'll have to do some cleanup so that the matching happens time step by time step while we still control for collapse and make sure the final time series constructed this way still behaves. Should be OK, but mechanical code changes are needed to be sure.
Or are we thinking to take some standard GCAM runs, take the Hector Tgav from those, and look at different spatial outcomes on these novel pathways? Same Hector Tgav, but there are X many spatially different worlds consistent with that Tgav; is there something interesting and in scope that we could say about those spatially different worlds without necessarily having to do feedbacks? And maybe even without having to run those through impacts models...
Ultimately it will come down to memory? The overshoots rely fairly heavily on land carbon sinks to get to the targets, so I could see memory being more important for our usual variables than in the monotonic scenarios.
In general, we're trying to decide if we can time-reverse the monotonic scenarios and use those for overshoots: a reflection across the x-axis of our 2d matching space. A good amount of code changes, and it will be clunky, but we can do it. I guess the question is whether we think there's any chance of it working before we embark on the code changes.
What would we do with the months even if we did convince ourselves it would work? We have a target point from an overshoot and we find a match for it in the (T, -X*dT) archive space.
So a 2080-2089 target gets matched to a 2059-2050 window. Then is it Jan 2080 = Jan 2059, Jan 2081 = Jan 2058, and so on (reverse the years but keep months in order within each year), or do we have to reverse the entire thing? The latter would be Jan 2080 = Dec 2059, Feb 2080 = Nov 2059, and seasonality would be all messed up. So we have to keep months in order so that a January gets matched to a January, and so on. But then we're mixing forward time flow (within each year) with backward time flow (across years), and there's almost no way that results in sensible time series? (Or time series mathematically indistinguishable from sensible ones?)
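The "reverse years, keep months in order" option can be written down explicitly (a sketch using the window bounds from the example above):

```r
# Target window 2080-2089 matched, time-reversed, to archive window
# 2059-2050: the year sequence runs backward but Jan..Dec stays in
# calendar order inside each year, so a January always maps to a January.
mapping <- data.frame(
  target_year  = rep(2080:2089, each = 12),
  archive_year = rep(2059:2050, each = 12),
  month        = rep(month.abb, times = 10)
)
head(mapping, 3)   # Jan-Mar 2080 all map to 2059
```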
Or IS (January to February to March) indistinguishable from (December to November to October)? From a land-carbon-stocks perspective, I don't see how it can be, but maybe that's too in the weeds. Maybe that's the first investigation: if I compute forward month-to-month differences and compare them to backward ones, how different do they end up? For sure we wouldn't be able to extend to daily data, but maybe seasonal signals at monthly resolution are symmetric enough. Pay special attention to grid cells with monsoon seasons? Could arguably restrict the investigation to just points in (T, X*dT) space with T <= 2.5C. Start with one ESM, figure out tests and code, extend
from there, and check whether it breaks ENSO. If those all look reasonable, the code changes to actually stitch on reversed time may be worth doing.
We do also have points, especially in the historical period, where dT is already negative. Could do a controlled test comparing two nine-year chunks from a single ESM that are at the same T and similar-magnitude dT: start the above comparisons focusing on those windows, then extend to projections. SSP126 and SSP245 will have natural points with negative dTs we can repeat the test on.
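One way to start the forward-vs-backward comparison proposed above, on synthetic data (a sketch; a pure sinusoidal cycle stands in for a grid cell's monthly tas, and real tests would use ESM output):

```r
set.seed(1)
yrs  <- 20
tas  <- 15 + 8 * sin(2 * pi * (1:(12 * yrs)) / 12) + rnorm(12 * yrs, sd = 0.5)
clim <- as.numeric(tapply(tas, rep(1:12, yrs), mean))   # monthly climatology

# Compare the seasonal cycle to its time-reversed version at every
# possible anchoring month: if some rotation of the reversed cycle
# matches the forward one, the signal is (close to) time-symmetric.
rotate <- function(v, k) v[((seq_along(v) - 1 + k) %% 12) + 1]
asymmetry <- min(sapply(0:11, function(k) max(abs(clim - rotate(rev(clim), k)))))
asymmetry   # small relative to the 8-degree amplitude for a sinusoid
```

An asymmetric cycle (e.g. a sharp monsoon onset and slow decay) would leave this minimum large no matter how the reversed cycle is anchored.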
And we know stitches should work on the monotonic (first) part of an overshoot time series; it's only once the drawdown kicks in that we have to get creative and validate.
Emulators work by relying on local variables being slaved to global temperature in at least some way, even if we can't write equations to describe that way. That's why stitches is cool: it doesn't make assumptions about the form of that relationship.
Suppose the way that relationship works in the monotonic scenarios is fundamentally different from the relationship in overshoot scenarios. Stitches is still 'learning' the monotonic relationship even if we reverse dT, no? It's assuming the local-global mapping in the overshoot scenarios is not fundamentally, structurally different from that, i.e. that flipping time is enough.
So if the time series we generate this way are trash (or if our comparison of naturally positive and negative dT windows of actual ESM data fails), then what is that telling us? Is there anything useful we can learn from that difference? I have very vague memories of talking to BK about using that to estimate hysteresis when he and I were brainstorming papers at some point.
Is that somewhere mean fields can help us hammer out where the failure is coming from, if it fails? Linear patterns ARE perfectly time-reversible.
Is that a useful test? Take an SSP126 pattern, run an SSP119 Tgav through it, focus on the negative-dT years, and compare the pattern-generated maps to the ensemble-average/smoothed maps. We'd want to start with an ESM that not only ran SSP119 but maybe even ran it for more than one ensemble member.
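As a sanity check on the reversibility claim, a linear pattern's behavior is easy to demonstrate (a toy sketch with made-up slopes, not real pattern data):

```r
# A linear pattern is a per-grid-cell slope; generating a map for any
# Tgav value, including negative-dT years, is a single multiply.
set.seed(7)
pattern <- matrix(runif(6, 0.5, 2.0), nrow = 2)   # slopes for 6 toy grid cells
tgav    <- c(1.0, 1.5, 1.8, 1.6, 1.2)             # overshoot-like Tgav path
maps    <- lapply(tgav, function(Tg) pattern * Tg)

# Reversing the Tgav path reverses the generated maps exactly:
identical(lapply(rev(tgav), function(Tg) pattern * Tg), rev(maps))   # TRUE
```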
Then the final practical consideration is how we frame it as a GCIMS paper.